Assessing women’s preferences towards tests that may reveal uncertain results from prenatal genomic testing: Development of attributes for a discrete choice experiment, using a mixed-methods design

Prenatal DNA tests, such as chromosomal microarray analysis or exome sequencing, increase the likelihood of receiving a diagnosis when fetal structural anomalies are identified. However, some parents will receive uncertain results such as variants of uncertain significance and secondary findings. We aimed to develop a set of attributes and associated levels for a discrete-choice experiment (DCE) that will examine parents’ preferences for tests that may reveal uncertain test results. A two-phase mixed-methods approach was used to develop attributes for the DCE. In Phase 1, a “long list” of candidate attributes was identified via two approaches: 1) a systematic review of the literature around parental experiences of uncertainty following prenatal testing; 2) 16 semi-structured interviews with parents who had experienced uncertainty during pregnancy and 25 health professionals who return uncertain prenatal results. In Phase 2, a quantitative scoring exercise with parents prioritised the candidate attributes. Clinically appropriate levels for each attribute were then developed. A final set of five attributes and levels was identified: likelihood of getting a result, reporting of variants of uncertain significance, reporting of secondary findings, time taken to receive results, and who tells you about your result. These attributes will be used in an international DCE study to investigate preferences and differences across countries. This research will inform best practice for professionals supporting parents to manage uncertainty in the prenatal setting.

When developing DCEs, attributes should be selected that reflect the essential characteristics of the product or intervention, are considered important, are understandable and are mutually exclusive [24]. The number of attributes chosen should be manageable; most DCEs present between four and eight attributes [25]. Too many attributes increase the complexity of the task for respondents, which may increase the chance of inconsistent responses across choice tasks or of respondents not considering all the attributes when making a decision [26]. Additionally, appropriate levels that are deemed "plausible, and capable of being traded" must be defined [25,27]. Several methods can be applied to develop attributes and levels, including literature reviews, focus groups, interviews or consultations with key stakeholders, patient surveys, and expert reviews [25]. The importance of qualitative work when developing DCE attributes has been emphasised [24,25,28]. Notably, guidance on the conduct of DCEs has highlighted the lack of rigour in reporting attribute development [24,25,29,30].
In this paper, we describe the use of both qualitative and quantitative methods to develop DCE attributes for an international comparison study that will examine patient preferences for receiving uncertain genomic test results in the prenatal setting.

Materials and methods
A clinical advisory group (five HPs with expertise in prenatal genomics and fetal medicine from the UK, USA, Australia and Singapore) provided input into attribute development and assignment of clinically relevant levels. Ethical approval for this study was granted by the UK National Health Service Health Research Authority London-Riverside (REC reference: 18/LO/2120). Written consent was provided by those participants taking part in interviews conducted face-to-face; verbal consent (approved by the ethics committee) was provided and documented by the interviewer in telephone interviews where written consent could not be obtained.
A sequential mixed-methods approach across two phases was used to develop the attributes for the DCE (Fig 1). During Phase 1, we aimed to understand the different types of uncertainty that arise following CMA and/or ES. To do this we undertook a systematic review and conducted semi-structured interviews with parents and HPs. From this work, we developed a list of candidate attributes that were important to the study population and were capable of being traded. We also considered attributes used in existing DCEs within the field of genomics or prenatal testing as a means of cross-checking against our own list of attributes to identify gaps and inform attribute descriptions. Phase 2 focused on reducing the candidate attribute list to those considered most important to parents using quantitative and qualitative methods, then determining the number of levels and their content.

Phase 1: Attribute development
Systematic review. Following methodological recommendations for the development of a DCE [24], we began by identifying potential attributes in the relevant published literature. We conducted a mixed-methods systematic literature review of women's views and experiences of uncertainty in pregnancy following CMA or ES. We aimed to understand the different sources of uncertainty that were encountered and how that uncertainty was managed in the clinical setting [31]. Studies were included if they were: 1. Investigating pregnant women and partners' experiences of uncertainty through the process of having CMA or ES; 2. Using qualitative, quantitative, cross-sectional or mixed-methods research approaches; 3. Published in English in a peer-reviewed journal.
Studies were excluded if they were: 1. Investigating experiences of uncertainty not identified following CMA or ES, such as risk scores following Down syndrome screening, non-invasive prenatal testing or karyotyping; 2. Investigating parents' experiences following newborn or paediatric CMA and ES; 3. Examining views of uncertainty based on purely hypothetical scenarios; 4. A review, case report, abstract, editorial or commentary.
We searched three electronic databases (PubMed, PsycINFO and Embase) using relevant keywords (S1 Fig). The reference lists of eligible studies were searched, as well as other studies by JH. The initial search was conducted in October 2018. A further search was conducted in July 2019 and no additional papers were identified. The results of the identified studies were synthesised using the principles of thematic analysis [32] and meta-ethnography, which allows integration of findings across different study designs [33].
Qualitative interviews with key stakeholders. Semi-structured interviews were conducted with two groups of stakeholders:
1. HPs (clinical scientists, geneticists, genetic counsellors, fetal medicine consultants, obstetricians and paediatricians) from Australia, Denmark, the Netherlands, Singapore, Sweden and the UK working in prenatal testing with experience reporting or returning CMA and/or ES results.
2. Parents from the UK and the Netherlands who had experienced uncertainty in their pregnancy following an anomalous fetal scan where the implications for the baby were unclear (and not suspected to be Down syndrome).
Parent participants in the UK were recruited using the social media pages of the charity Antenatal Results and Choices (ARC) and through Great Ormond Street Hospital (GOSH) in London. In the Netherlands, parent participants were recruited via a Clinical Geneticist at Erasmus Medical Centre in Rotterdam. Interviews were conducted with stakeholders from two different countries to ensure the chosen attributes would be widely relevant. Interviews with parents were conducted by CL and JH in the UK, and by JEK in the Netherlands. Interviews with HPs were conducted by the co-authors in their respective countries, with the exception of Singapore where interviews were conducted by a co-author from the UK (Australia-EJS, Denmark-SL, The Netherlands-JEK, Sweden-CI-M, UK-CL and EH, Singapore-CL). Interviews were conducted in the native (or national in the case of Singapore) language (other than Sweden where they were conducted in English), then translated into English by members of the research team (who are bilingual and work in two languages in their daily professional capacity). Full details of sampling, recruitment and data analysis are published elsewhere [13,34].
Topic guide. Draft interview topic guides were developed by CL, JH and MH based on the findings of the systematic review and were revised with input from the wider research team (S2 and S3 Figs). Topic guides focused on what different types of uncertain results interviewees had come across, and how those results were managed. We also provided interviewees with a list of different types of uncertain results that had been identified from the systematic review (VUS, secondary findings, variants with reduced penetrance and variants with variable expression) and asked them whether these results should be fed back to parents.
Data analysis. Data were collected and analysed concurrently. Interviews were audio-recorded, transcribed verbatim and translated into English. Transcripts were coded and analysed using thematic analysis [32] with an abductive approach, which engages in a two-way dialogue between data and theory [35]. This approach was suitable for the qualitative analysis of this study, where we would be drawing together constructs from Han's taxonomy to explain and apply context and meaning to the data obtained [35]. The parent and HP interviews were analysed as two independent data sets. Data collection ceased when data saturation was reached and no new themes or codes were emerging from the interview data. To ensure inter-researcher reliability, two researchers coded and categorised both datasets and the findings were discussed by all members of the research team.
Integration of findings. To produce an initial long list of potential attributes, the findings of the systematic review and stakeholder interviews were collated and compared. We focused on identifying attributes that reflected the sources of uncertainty that can be experienced following prenatal genomic testing that could be quantifiable as attributes with multiple levels in a DCE. To aid in understanding, we referred to a taxonomy of uncertainty by Han et al [36]. We also included attributes related to the management of uncertain prenatal test results because these were considered important by stakeholders in coming to a decision [37] and could mitigate against the impact of the uncertainty (where management was well done) or could enhance the sense of uncertainty (where management was poorly done).
Consideration of attributes in other published DCEs. Attribute development was undertaken using an inductive approach whereby our attributes were derived through research conducted with key stakeholders. To check whether we had missed any relevant attributes and to consider how others had framed similar concepts in previous research, we also reviewed other DCEs in the fields of prenatal testing, CMA and/or exome/genome sequencing (see S1 Fig for the list of search terms used). Attributes from these DCEs were considered alongside those identified in Phase 1 of our study as a form of cross-checking.

Phase 2: Reducing the number of attributes and the development of levels
The long list of attributes was discussed with the research team, with the aim of removing any attributes that were 1) not quantifiable and therefore would not be feasible in the context of a DCE, 2) related to the condition being tested for rather than being an attribute of the test or its delivery, i.e. the condition is not an essential characteristic of the intervention, 3) not relevant to clinical practice, i.e. they could not be used to guide recommendations for delivering prenatal genomic tests and dealing with uncertainty. The refined list of attributes was then reviewed by: 1. a sub-group of the parent participants from the UK and the Netherlands who had taken part in the stakeholder interviews in Phase 1; 2. a patient advocate from the support group Antenatal Results and Choices (ARC); and 3. parents who had had a pregnancy in the previous three years (who were known to the authors), and had not experienced uncertainty linked to an anomalous fetal scan during that pregnancy. We did this to seek representative views from women with different experiences of pregnancy.
We used a quantitative scoring exercise to rank the importance of each attribute. Similar quantitative approaches have been used in other DCEs to identify those attributes considered most important [38][39][40]. Each participant was presented with the list of attributes and asked to score the importance of each attribute on a scale of 1 (not important) to 5 (most important). As they were scoring each attribute, they were asked to verbalise ("think aloud") their decision-making process. This process was conducted either face-to-face or via telephone with one of the researchers (JH, CL or JEK), with qualitative and quantitative data captured on a score sheet.
The final step was to discuss the mean 'importance' scores with the research team and clinical advisory group to identify attributes that were the most relevant to uncertainty in a prenatal testing setting. For each of the final attributes, levels were chosen that represented a realistic range (as identified by the literature e.g. for diagnostic yield, or related to current practice e.g. for who returns results), over which DCE responders were expected to make trade-offs. Potential levels were discussed and agreed during a face-to-face meeting with our clinical advisory group.

Results
Phase 1: Attribute development
Our systematic review identified fourteen studies (ten qualitative, four quantitative) that met our inclusion criteria [14]. These studies were set in the USA, UK, Australia and the Netherlands, and captured the views of 914 participants (678 women, 236 partners). Interview participants included 16 parents from the UK (n = 9) and Netherlands (n = 7) who had experienced uncertainty following the detection of an undiagnosed fetal anomaly, 11 of whom had gone on to have invasive testing (Table 1), and 25 HPs (clinical scientists, consultants in clinical genetics, obstetricians and genetic counsellors) from the UK (n = 6), the Netherlands (n = 6), Denmark (n = 5), Singapore (n = 4) and Australia (n = 4) (Table 1).
Overall, 19 candidate attributes were identified from the systematic review and interviews (Table 2). The candidate attributes were categorised as either 'Sources of uncertainty' (i.e. the type of uncertainty) or 'Management of uncertainty' (i.e. how the uncertainty is managed inside and outside the clinic including service-related issues). Nine candidate attributes were regarded as 'Sources of uncertainty'; of these, seven attributes were drawn from more than one dataset (HP interviews, patient interviews and systematic review) and two were drawn from the HP interviews only (Table 2). Ten candidate attributes were regarded as 'Management of uncertainty'. Of these, eight were drawn from both parent and HP datasets, one was drawn from the HP dataset, and one further attribute was drawn from the patient dataset.

Phase 2: Reducing the number of attributes and the development of levels
Following discussion with the research team, four attributes were removed because they were not considered to be quantifiable (genotype-phenotype correlations; how an anomaly with a well-known postnatal phenotype presents prenatally; technical validity of the test; and incomplete result). Two further attributes (penetrance and variable expressivity) were removed because the uncertainties related to the condition itself, rather than the genomic test (our DCE will ask participants to choose between two tests). The remaining thirteen attributes were presented to stakeholders who represented parent views. In the UK, nine participants reviewed the attributes: three had experienced uncertainty during their pregnancy and had undergone invasive testing, five were parents who had not experienced uncertainty during pregnancy, and one was a parent advocate. In the Netherlands, seven participants reviewed the attributes: four had experienced uncertainty during pregnancy and had undergone invasive testing, and three were parents who had not experienced uncertainty during pregnancy. Six attributes were given a mean score of at least 4 (i.e. either important or very important). These were: Q4-length of time to get results (4.7), Q3a-secondary findings (of relevance to the baby) (4.6), Q11-communication style of HP delivering results (4.5), Q10-how results are delivered (4.4), Q1-diagnostic yield (4.4), and Q8-what type of HP delivers the results (4.0) (Table 3).
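The shortlisting rule described above (keep attributes with a mean importance score of at least 4) can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the dictionary holds just the eight mean scores quoted in the text (the full set of thirteen is in Table 3), and the names `reported_means`, `THRESHOLD` and `shortlist` are assumptions introduced here.

```python
# Mean importance scores (scale 1-5) as quoted in the text.
# Only eight of the thirteen scored attributes are reported in the prose;
# the remainder appear in Table 3 and are not reproduced here.
reported_means = {
    "Q4 length of time to get results": 4.7,
    "Q3a secondary findings (of relevance to the baby)": 4.6,
    "Q11 communication style of HP delivering results": 4.5,
    "Q10 how results are delivered": 4.4,
    "Q1 diagnostic yield": 4.4,
    "Q8 what type of HP delivers the results": 4.0,
    "secondary findings (of relevance to parents)": 3.2,
    "cost of the test": 1.8,
}

THRESHOLD = 4.0  # "a mean score of at least 4" in the text


def shortlist(means, threshold=THRESHOLD):
    """Return (attribute, mean) pairs meeting the threshold, highest first."""
    kept = [(name, score) for name, score in means.items() if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


selected = shortlist(reported_means)
for name, score in selected:
    print(f"{score:.1f}  {name}")
```

A fixed cut-off like this is only a first filter; as the text goes on to describe, the research team and clinical advisory group then added and removed attributes on clinical grounds.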
When discussing the length of time to get a result, participants felt that the "test should be done properly" but they "wouldn't want to wait longer than 1 week" for a result (UK P3), with one participant noting that "24 hours would be ideal" (UK P5). Regarding secondary findings (of relevance to the baby) parents stated that they would "want to know everything for the baby" (UK P5). However, one parent stated that wanting to know "would depend on the severity. If severe, then definitely" (Dutch P2). For HP communication style a "compassionate communication style in times of stress" (UK P4) was preferred. For how results are delivered, parents felt results should be "relayed back in layman's terms, no jargon" (UK P1). For which HP should deliver results, some parents noted a preference towards a genetics specialist returning results. However, one participant felt that who delivered results "was not the most important, as long everything you need to know is fed back" (UK P5), and another stated that the "same person should be involved throughout the process" (UK P2). How results were delivered could also depend on the severity of results: "If a good result, then via the phone. But if it's a bad result, then personal contact" (Dutch P4). The diagnostic yield of the test was frequently linked to the risk of miscarriage associated with invasive testing, with one participant stating that this factor is "even more important" when it is invasive (Dutch P4) and another saying that she "wouldn't put [herself] through it, if there wasn't a good chance of getting an answer" (UK P7).
Attributes that had lower mean importance scores included secondary findings (of relevance to parents) (3.2). One participant felt strongly that "they wouldn't want to know everything" (UK P2) when it came to secondary findings, and another echoed that they "wouldn't want to worry unnecessarily" about such findings (UK P3).
The cost of the test (1.8), including the role cost would play in determining whether one would choose to have a test or not, was also considered to be less important, regardless of which country the participant was from. Participants reported that, if required, they would be willing to pay for this test (ranging from £500 to £2000). However, for one participant, willingness to pay any price at all could depend on the abnormality: "If not considered severely debilitating, [she] wouldn't want to pay that much, if at all. But if it was a potentially serious condition, then [she] would pay" (UK P2). For others, their own financial circumstances could be a deciding factor.
When we compared the results of women who had experienced uncertainty with women who had not, five of the same six attributes had a mean score of at least 4. However, the attribute what type of HP delivers the results had a mean score of 3.5 amongst those who hadn't experienced uncertainty, possibly reflecting that the specialist who delivers the result is of less importance when that result is clear, easy to explain and no further investigations are required. When taking into consideration which country the women were from (UK or the Netherlands), the same six attributes had a mean score of at least 4. However, there were two attributes that had a mean score of at least 4 in the Netherlands but a mean score of less than 4 in the UK, namely notification about the identification of VUS (4.5 v 3.0) and who decides which results are fed back (4.0 v 3.3) (See Table 3).
The results of the scoring exercise were discussed with the research team. It was agreed that notification about the identification of VUS should be included in the DCE, despite only achieving a mean importance score of 3.3. This attribute captured a type of uncertainty that we were particularly interested in, as variability exists in terms of how these results are handled (as identified through the HP interviews), and on which it would be important to comment in future recommendations. It was also agreed to exclude the attribute who decides what results are fed back due to its low mean importance score (3.2). The seven remaining attributes were then discussed with the clinical advisory group. They agreed that the attribute communication style of HPs delivering results should be excluded because HPs should always adopt an empathic style when speaking with patients; the value placed on this particular attribute would therefore not inform a change to clinical management. They also felt that the attribute how results are delivered should be excluded as these policies are generally decided at a departmental or hospital level and all patients receiving an uncertain result should be seen in person irrespective of how that result is initially delivered. The clinical advisory team agreed that all the attributes selected satisfied the essential characteristics of a DCE attribute in that they reflected the characteristics of prenatal genomic tests and their management, were considered important, were understandable and were mutually exclusive.
For each of the five remaining attributes, two to four clinically feasible levels were chosen that were grounded in reality yet in some cases (e.g. diagnostic yield) represented the higher and lower ends of what was realistic, to 'force' participants to make decisions and trade-offs. For example, the levels set for diagnostic yield were 5%, 30% and 60% as these represented the upper and lower limits of what has been found to be clinically feasible [3]. The final set of attributes and levels is presented in Table 4.
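As a rough illustration of how attribute levels combine into candidate test profiles, the following Python sketch enumerates a full factorial of the five attributes. Only the diagnostic-yield levels (5%, 30%, 60%) come from the text. The level counts for the other four attributes are hypothetical placeholders within the stated two-to-four range (the actual levels are those in Table 4), and the full-factorial enumeration itself is an assumption made for illustration, not a description of the experimental design used in the study.

```python
from itertools import product
from math import comb


def placeholder_levels(n):
    """Generate n generic level labels (hypothetical, for illustration only)."""
    return [f"level {i}" for i in range(1, n + 1)]


attribute_levels = {
    # Levels stated in the text.
    "likelihood of getting a result": ["5%", "30%", "60%"],
    # Hypothetical level counts; see Table 4 for the real levels.
    "reporting of variants of uncertain significance": placeholder_levels(2),
    "reporting of secondary findings": placeholder_levels(2),
    "time taken to receive results": placeholder_levels(4),
    "who tells you about your result": placeholder_levels(3),
}

names = list(attribute_levels)

# Every combination of one level per attribute is one candidate test profile.
profiles = [dict(zip(names, combo)) for combo in product(*attribute_levels.values())]

# The DCE asks participants to choose between two tests, so count unordered pairs.
n_pairs = comb(len(profiles), 2)

print(len(profiles), "possible test profiles")
print(n_pairs, "possible two-test choice pairs")
```

Even with these placeholder counts, the number of possible pairs is far larger than any respondent could evaluate, which is one reason the number of attributes and levels is kept manageable.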

Discussion
Uncertainty is not uncommon in genomic medicine and new genomic technologies such as ES increase the potential for inconclusive test results as well as VUS and secondary findings [44]. One of the key challenges in researching and addressing uncertainty in this context is conceptualising what uncertainty looks like and what it means for the target population. This paper describes the development of attributes for a DCE that will examine parents' preferences for tests that may reveal uncertain test results. Applying a mixed-methods approach, we undertook a qualitative analysis of the existing literature and interviewed parents and HPs to aid the development of attributes, using quantitative methods to refine the number of candidate attributes. The final attributes and levels were then agreed upon by an expert clinical group. The final list of attributes reflects multiple aspects of uncertainty in a prenatal setting and includes potential sources of uncertainty and issues linked to its management (Table 4). The inclusion of VUS and secondary findings is timely because reporting guidelines and practices around whether these should be returned differ in this area both between and within different countries, as highlighted by our recent review of guidelines in this area [45], and HP and parent views have been reported to differ [46]. The likelihood of getting a result is also topical given that diagnostic yield has been found to vary considerably depending on whether the fetus has isolated or multiple anomalies [3]. Regarding time taken to receive results, parents waiting for ES results following the identification of a fetal anomaly have found the period long and trying [7] and studies have shown that some HPs including genetic counsellors as well as maternity HPs are concerned about returning these types of results to their patients and desire further guidance in this area [47][48][49].
Our international DCE employing these attributes will yield important and timely insights into which uncertain results should be returned to pregnant couples, and which attributes are most pressing when parents make decisions about prenatal genomic tests. In particular, we will identify the most important attribute to parents when making decisions and the relative importance of this attribute compared to the other attributes; whether there is heterogeneity in preferences across countries with differing cultures and healthcare systems, and across participant types (e.g. whether older women or women who have experienced uncertainty in a previous pregnancy place greater emphasis on certain attributes than others); and what proportion of women would not opt for an invasive test following receipt of an abnormal fetal anomaly scan result.
An important strength in the development of the attributes presented in this paper was the use of both qualitative and quantitative methods to identify attributes specifically related to uncertainty. By first conducting a systematic review on parents' experiences of uncertainty in the prenatal setting, we were able to identify a longlist of candidate attributes. However, identifying attributes and their levels exclusively on the basis of a literature review may lead to the non-inclusion of some important attributes [24]. Accordingly, we extended our findings from the systematic review by conducting qualitative interviews with both parents and HPs. Triangulating parent and HP views enhances the study's credibility [50]. Our use of interviews addresses recommendations to include qualitative work in developing DCE attributes [24,25,28]. Whilst the majority of the attributes identified in our systematic review were found in the qualitative interviews (13 out of the initial list of 19), we did indeed identify six attributes that were not identified in the review, including three attributes identified through the interviews with parents (which is notable given that the review focused on the experience of parents). This highlights the importance of conducting qualitative work in the development of DCEs.
Coast and Horrocks highlight that a 'tension' can exist between the purpose of qualitative work (in obtaining deep understanding of the phenomenon) and the "reductive aim" of describing key concepts in as few attributes as possible [25]. Whilst it is possible that the attributes themselves do not do justice to the "complexity of the individuals' preferences" [25], we aimed to mitigate this issue by asking parent interviewees to review the attributes that were developed (member checking). Furthermore, including stakeholders from countries with differing cultures and healthcare systems increases the potential generalizability of our findings. Finally, we ensured that the attributes (and associated levels) satisfied the essential characteristics of a DCE attribute by validating them with a clinical advisory team.
Our method has several limitations. All but two parents from the UK were recruited from a parent support group (ARC), and may have had particularly negative experiences during their pregnancy that led them to seek support. In addition, the parent sample recruited through ARC was relatively homogeneous, particularly in terms of education level and gender. This may have affected which attributes were considered most important, with, for example, those considered most important to women being included in the final set. Further research with partners of women who have experienced uncertainty following a fetal anomaly would therefore be valuable. Furthermore, given that the attributes selected reflect those considered most important to women, this in turn could affect the topics chosen for discussion by health professionals during the counselling session. It is therefore important that health professionals ensure the views and concerns of men are also identified and addressed.
We included women who had not experienced uncertainty during their pregnancy as a comparator; however, these women were recruited via a convenience sample of individuals known to the researchers. This may have limited the representativeness of the sample, although there were few differences between the women who did and did not experience uncertainty in terms of their attribute preferences. Additionally, the sample size for the quantitative work was relatively small. Although agreement between the scores of parents who experienced uncertainty and those who had not was good, future studies could consider undertaking the scoring of potential attributes with a broader sample. This would enable the application of more complex quantitative methods to select attributes for the DCE, such as Rasch analysis or factor analysis. Another limitation is that only Dutch parents had direct experience of ES, so they may have had different experiences regarding uncertainty in prenatal testing. A further potential limitation is that we did not consider the inclusion of attributes that were related to the condition being tested for rather than being an attribute of the test or its delivery. Tolerance of uncertainty may be linked to condition severity and could consequently impact on uptake of prenatal genomic testing. However, our DCE will investigate preferences for test attributes and delivery of results, and will not generate information on potential test uptake. Finally, it could be argued that views regarding uncertainty may not be generalisable across countries with differing healthcare systems. However, both UK and Dutch participants had similar views regarding the most important attributes, as did our international research team and clinical advisory group.

Conclusions
We have described the development of attributes for a DCE assessing preferences towards receiving uncertain results from genomic testing. Using a mixed-methods approach, we have identified a set of five attributes for use in a DCE survey, with input from parents, HPs and experts in prenatal genomics. These have been used in a DCE survey that has been translated into multiple languages and recently used internationally to assess and compare tolerance for uncertainty in prenatal testing, the results of which are currently being analysed.